
Direct Semantic Communication Between Large Language Models via Vector Translation

Yang, Fu-Chun, Eshraghian, Jason

arXiv.org Artificial Intelligence

When two Large Language Models (LLMs) debate an answer, critique each other's chain of thought, or sequentially refine a shared draft of text, they speak through plain tokens. Every round forces each model to flatten rich internal geometry into text, operate on that text, then rebuild meaning. Computational resources are wasted, and the limited bandwidth of the token channel can erase nuance. Specialised LLMs thus operate in isolation, communicating only through text interfaces that constrain information transfer and add overhead. Encoding semantics into tokens and re-decoding them discards much of the latent structure that models use internally, blurring complex relationships in the process. Yet each LLM carries a distinct internal representation space shaped by its architecture, training objective, and data. Those spaces differ enough that raw vectors are not interchangeable, prompting the question: can semantic information encoded in one model's vector space be translated so that another model can use it directly? We demonstrate this is possible by learning bidirectional vector translations that create a latent bridge between models. Injecting these translated vectors directly into a target model's pipeline lets the pair share meaning without serialising to tokens, enabling chains, ensembles, and parallel collaborations to run at latent speed and bypass text-based limitations.
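As a rough illustration of the idea, the sketch below trains a bidirectional pair of translators between two hidden spaces with a simple reconstruction loss. The dimensions, architecture, and synthetic paired activations are illustrative assumptions, not the authors' released method.

    # A minimal sketch of a learned latent bridge between two LLM hidden spaces.
    import torch
    import torch.nn as nn

    SRC_DIM, TGT_DIM = 4096, 5120  # hidden sizes of source/target models (assumed)

    class VectorTranslator(nn.Module):
        """Maps hidden states of a source model into the target model's space."""
        def __init__(self, d_src, d_tgt):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(d_src, d_tgt),
                nn.GELU(),
                nn.Linear(d_tgt, d_tgt),
            )
        def forward(self, h_src):
            return self.net(h_src)

    src_to_tgt = VectorTranslator(SRC_DIM, TGT_DIM)
    tgt_to_src = VectorTranslator(TGT_DIM, SRC_DIM)  # bidirectional bridge
    opt = torch.optim.Adam(
        list(src_to_tgt.parameters()) + list(tgt_to_src.parameters()), lr=1e-4)

    # Train on paired activations (h_src, h_tgt) collected from the same text;
    # random tensors stand in for real activations here.
    h_src = torch.randn(32, SRC_DIM)
    h_tgt = torch.randn(32, TGT_DIM)
    loss = (nn.functional.mse_loss(src_to_tgt(h_src), h_tgt)
            + nn.functional.mse_loss(tgt_to_src(h_tgt), h_src))
    opt.zero_grad(); loss.backward(); opt.step()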


FAIR: Facilitating Artificial Intelligence Resilience in Manufacturing Industrial Internet

Zeng, Yingyan, Lourentzou, Ismini, Deng, Xinwei, Jin, Ran

arXiv.org Artificial Intelligence

Artificial intelligence (AI) systems have been increasingly adopted in the Manufacturing Industrial Internet (MII). Investigating and enabling AI resilience is important for alleviating the profound impact of AI system failures on manufacturing and Industrial Internet of Things (IIoT) operations, which drive critical decision making. However, there is a wide knowledge gap in defining the resilience of AI systems and in analyzing potential root causes and corresponding mitigation strategies. In this work, we propose a novel framework for investigating the resilience of AI performance over time under hazard factors in data quality, AI pipelines, and the cyber-physical layer. The proposed method, based on a multimodal multi-head self latent attention model, can facilitate effective diagnosis and mitigation strategies to recover AI performance. The merits of the proposed method are demonstrated using an MII testbed of connected Aerosol Jet Printing (AJP) machines, fog nodes, and a Cloud running inference tasks via AI pipelines.
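For intuition, here is a minimal sketch of fusing multimodal process signals with multi-head self-attention, loosely in the spirit of the multimodal multi-head self latent attention model; the modality count, dimensions, and prediction head are assumptions.

    # Fuse per-modality embeddings (e.g. sensor, image, log; assumed) with
    # multi-head self-attention and predict a resilience/performance score.
    import torch
    import torch.nn as nn

    D, HEADS, MODALITIES = 64, 4, 3

    proj = nn.ModuleList([nn.Linear(D, D) for _ in range(MODALITIES)])
    attn = nn.MultiheadAttention(embed_dim=D, num_heads=HEADS, batch_first=True)
    head = nn.Linear(D, 1)  # hypothetical AI-performance score

    feats = [torch.randn(8, D) for _ in range(MODALITIES)]  # one vector per modality
    tokens = torch.stack([p(f) for p, f in zip(proj, feats)], dim=1)  # (B, M, D)
    fused, weights = attn(tokens, tokens, tokens)  # weights hint at which modality drives a failure
    score = head(fused.mean(dim=1))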


Intent-Aware DRL-Based Uplink Dynamic Scheduler for 5G-NR

Mostafa, Salwa, Mota, Mateus P., Valcarce, Alvaro, Bennis, Mehdi

arXiv.org Artificial Intelligence

We investigate the problem of supporting Industrial Internet of Things user equipment (IIoT UEs) with intents (i.e., requested quality of service (QoS)) and random traffic arrivals. A deep reinforcement learning (DRL)-based centralized dynamic scheduler for time-frequency resources is proposed to learn how to schedule the available communication resources among the IIoT UEs. The proposed scheduler leverages an RL framework to adapt to dynamic changes in the wireless communication system and traffic arrivals. Moreover, a graph-based reduction scheme is proposed to shrink the state and action spaces of the RL framework, allowing faster convergence and a better learning strategy. Simulation results demonstrate the effectiveness of the proposed intelligent scheduler in guaranteeing the expressed intents of IIoT UEs compared to several traditional scheduling schemes, such as round-robin, semi-static, and heuristic approaches. The proposed scheduler also outperforms contention-free and contention-based schemes in maximizing the number of successfully computed tasks.
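A minimal sketch of what such a centralized DRL scheduler could look like follows, with an epsilon-greedy Q-network picking which UE to serve next; the state encoding (queue lengths plus QoS deficits) and network sizes are assumptions.

    # Centralized DQN-style scheduler: state -> Q-value per UE, pick the best.
    import random
    import torch
    import torch.nn as nn

    N_UE = 8
    STATE_DIM = 2 * N_UE  # per-UE queue length and intent (QoS) deficit, assumed

    q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_UE))

    def schedule(state, eps=0.1):
        """Epsilon-greedy choice of the UE to serve in this TTI."""
        if random.random() < eps:
            return random.randrange(N_UE)
        with torch.no_grad():
            return int(q_net(state).argmax())

    state = torch.randn(STATE_DIM)
    ue = schedule(state)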


Emergent Communication Protocol Learning for Task Offloading in Industrial Internet of Things

Mostafa, Salwa, Mota, Mateus P., Valcarce, Alvaro, Bennis, Mehdi

arXiv.org Artificial Intelligence

In this paper, we leverage a multi-agent reinforcement learning (MARL) framework to jointly learn a computation offloading decision and a multichannel access policy with the corresponding signaling. Specifically, the base station and the industrial Internet of Things (IIoT) mobile devices are reinforcement learning agents that need to cooperate to execute their computation tasks within a deadline constraint. We adopt an emergent communication protocol learning framework to solve this problem. The numerical results illustrate the effectiveness of emergent communication in improving the channel access success rate and the number of successfully computed tasks compared to contention-based, contention-free, and no-communication approaches. Moreover, the proposed task offloading policy outperforms remote and local computation baselines.
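To make the emergent-communication setup concrete, the sketch below shows an agent that jointly emits an offloading action and a learned message for the other agents; the message width, recurrence, and action set are assumptions.

    # An agent that outputs both a policy over actions and an emergent message.
    import torch
    import torch.nn as nn

    OBS_DIM, MSG_DIM, N_ACTIONS = 16, 4, 3  # e.g. local / offload / idle (assumed)

    class CommAgent(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.GRUCell(OBS_DIM + MSG_DIM, 64)
            self.policy = nn.Linear(64, N_ACTIONS)   # offloading / channel action
            self.speaker = nn.Linear(64, MSG_DIM)    # message to the other agents

        def forward(self, obs, msg_in, h):
            h = self.encoder(torch.cat([obs, msg_in], dim=-1), h)
            return self.policy(h), torch.tanh(self.speaker(h)), h

    agent = CommAgent()
    h = torch.zeros(1, 64)
    logits, msg_out, h = agent(torch.randn(1, OBS_DIM), torch.zeros(1, MSG_DIM), h)
    action = torch.distributions.Categorical(logits=logits).sample()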


QOCO: A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning for Mobile Edge Computing

Rahmati, Iman, Shah-Mansouri, Hamed, Movaghar, Ali

arXiv.org Artificial Intelligence

In the realm of mobile edge computing (MEC), efficient computation task offloading plays a pivotal role in ensuring a seamless quality of experience (QoE) for users. Maintaining a high QoE is paramount in today's interconnected world, where users demand responsive and reliable services, and is a primary challenge in dynamic and uncertain mobile environments. In this study, we delve into computation offloading in MEC systems, where strict task-processing deadlines and energy constraints can adversely affect system performance. We formulate the computation task offloading problem as a Markov decision process (MDP) to maximize the long-term QoE of each user individually. We propose a decentralized QoE-oriented computation offloading (QOCO) algorithm based on deep reinforcement learning (DRL) that empowers mobile devices to make offloading decisions without requiring knowledge of the decisions made by other devices. Through numerical studies, we evaluate the performance of QOCO. Simulation results validate that the QOCO algorithm efficiently exploits the computational resources of edge nodes. Consequently, it can complete 14% more tasks and reduce task delay and energy consumption by 9% and 6%, respectively. Together, these contribute to a significant improvement of at least 37% in average QoE compared to an existing algorithm.
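As an illustration of the kind of QoE signal such an MDP could optimize, the sketch below rewards on-time completion and penalizes delay and energy; the weights and deadline rule are assumptions, not the paper's exact reward.

    # Hypothetical per-task QoE reward: deadline misses dominate, then delay/energy.
    def qoe_reward(completed, delay_s, energy_j, deadline_s,
                   w_delay=1.0, w_energy=0.5, bonus=10.0):
        if not completed or delay_s > deadline_s:
            return -bonus  # a missed deadline hurts QoE the most
        return bonus - w_delay * delay_s - w_energy * energy_j

    r = qoe_reward(completed=True, delay_s=0.08, energy_j=0.3, deadline_s=0.1)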


Deep Reinforcement Learning for Stochastic Computation Offloading in Digital Twin Networks

Dai, Yueyue, Zhang, Ke, Maharjan, Sabita, Zhang, Yan

arXiv.org Artificial Intelligence

The rapid development of the Industrial Internet of Things (IIoT) is pushing industrial production towards digitalization to improve network efficiency. Digital Twin is a promising technology to empower the digital transformation of IIoT by creating virtual models of physical objects. However, providing network efficiency in IIoT is very challenging due to resource-constrained devices, stochastic tasks, and resource heterogeneity. Distributed resources in IIoT networks can be efficiently exploited through computation offloading to reduce energy consumption while enhancing data processing efficiency. In this paper, we first propose a new paradigm, Digital Twin Networks (DTN), to build the network topology and the stochastic task arrival model in IIoT systems. We then formulate the stochastic computation offloading and resource allocation problem to optimize long-term energy efficiency. As the formulated problem is a stochastic programming problem, we leverage the Lyapunov optimization technique to transform the original problem into a deterministic per-time-slot problem. Finally, we present an Asynchronous Actor-Critic (AAC) algorithm to find the optimal stochastic computation offloading policy. Illustrative results demonstrate that our proposed scheme significantly outperforms the benchmarks.
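For intuition on the Lyapunov step, the sketch below scores candidate per-slot actions with a drift-plus-penalty term that trades energy against queue backlog growth; the control parameter V and the queue model are assumptions.

    # Drift-plus-penalty: pick the per-slot action minimizing V*energy + backlog drift.
    def drift_plus_penalty(action, queues, arrivals, service, energy, V=50.0):
        """Score a candidate action for one time slot (lower is better)."""
        penalty = V * energy[action]
        drift = sum(q * max(a - s, 0.0)  # queue-weighted residual backlog growth
                    for q, a, s in zip(queues, arrivals[action], service[action]))
        return penalty + drift

    # Choose the better of two candidate actions (e.g. 0 = local, 1 = offload).
    queues = [3.0, 1.0]
    arrivals = {0: [1.0, 0.5], 1: [1.0, 0.5]}
    service = {0: [0.5, 0.2], 1: [1.5, 0.8]}
    energy = {0: 0.2, 1: 0.6}
    best = min(energy, key=lambda a: drift_plus_penalty(a, queues, arrivals, service, energy))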


Information Freshness-Aware Task Offloading in Air-Ground Integrated Edge Computing Systems

Chen, Xianfu, Wu, Celimuge, Chen, Tao, Liu, Zhi, Zhang, Honggang, Bennis, Mehdi, Liu, Hang, Ji, Yusheng

arXiv.org Machine Learning

This paper studies the problem of information freshness-aware task offloading in an air-ground integrated multi-access edge computing system deployed by an infrastructure provider (InP). A third-party real-time application service provider offers computing services to subscribed mobile users (MUs) using the limited communication and computation resources of the InP under a long-term business agreement. Due to the dynamic characteristics of the system, the interactions among the MUs are modelled as a non-cooperative stochastic game, in which the control policies are coupled and each MU aims to selfishly maximize its own expected long-term payoff. To approach the Nash equilibrium solutions, we propose that each MU behaves in accordance with its local system states and conjectures, based on which the stochastic game is transformed into a single-agent Markov decision process. Moreover, we derive a novel online deep reinforcement learning (RL) scheme that adopts two separate double deep Q-networks for each MU to approximate the Q-factor and the post-decision Q-factor. Using the proposed deep RL scheme, each MU in the system is able to make decisions without a priori statistical knowledge of the dynamics. Numerical experiments examine the potential of the proposed scheme in balancing the age of information and the energy consumption.
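The double-DQN machinery referenced here is standard; the sketch below shows the target computation, where the online network selects the next action and the target network evaluates it. Network shapes and the discount factor are assumptions.

    # Double-DQN target: decouple action selection (online) from evaluation (target).
    import torch
    import torch.nn as nn

    STATE_DIM, N_ACTIONS, GAMMA = 12, 4, 0.99
    online = nn.Linear(STATE_DIM, N_ACTIONS)
    target = nn.Linear(STATE_DIM, N_ACTIONS)
    target.load_state_dict(online.state_dict())

    def double_dqn_target(reward, next_state, done):
        with torch.no_grad():
            a_star = online(next_state).argmax(dim=-1, keepdim=True)
            q_next = target(next_state).gather(-1, a_star).squeeze(-1)
        return reward + GAMMA * (1.0 - done) * q_next

    y = double_dqn_target(torch.tensor([1.0]), torch.randn(1, STATE_DIM),
                          torch.tensor([0.0]))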


Distributed Gradient Descent with Coded Partial Gradient Computations

Ozfatura, Emre, Ulukus, Sennur, Gunduz, Deniz

arXiv.org Machine Learning

Coded computation techniques provide robustness against straggling servers in distributed computing, but they have the following limitations: first, they increase decoding complexity; second, they ignore the computations carried out by straggling servers; and third, they are typically designed to recover the full gradient and thus cannot trade off gradient accuracy against per-iteration completion time. Here we introduce a hybrid approach, called coded partial gradient computation (CPGC), that combines the advantages of coded and uncoded computation schemes while reducing both the computation time and the decoding complexity.
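As a reference point for the coded side of this hybrid, the sketch below shows plain MDS-coded gradient aggregation, where any k of n server responses suffice to decode the full gradient; the Vandermonde code and sizes are illustrative assumptions.

    # MDS-coded gradients: n servers each return one coded combination of k
    # partial gradients; any k responses let the master decode and aggregate.
    import numpy as np

    k, n, d = 2, 3, 4                      # data parts, servers, gradient dim
    G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)  # (n, k)
    partials = np.random.randn(k, d)       # per-part gradients g_1, g_2

    coded = G @ partials                   # what each server would compute
    fastest = [0, 2]                       # indices of the first k responders
    decoded = np.linalg.solve(G[fastest], coded[fastest])
    full_grad = decoded.sum(axis=0)
    assert np.allclose(full_grad, partials.sum(axis=0))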


Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers

Ozfatura, Emre, Gunduz, Deniz, Ulukus, Sennur

arXiv.org Machine Learning

Distributed gradient descent (DGD) is an efficient way of implementing gradient descent (GD), especially for large data sets, by dividing the computation into smaller sub-tasks and assigning them to different computing servers (CSs) to be executed in parallel. In standard parallel execution, per-iteration waiting time is limited by the execution time of the straggling servers. Coded DGD techniques have been introduced recently, which can tolerate straggling servers by assigning redundant computation tasks to the CSs. In most of the existing DGD schemes, whether with coded computation or coded communication, a non-straggling CS transmits one message per iteration once it completes all of its assigned computation tasks. However, although straggling servers cannot complete all their assigned tasks, they are often able to complete a certain portion of them. In this paper, we allow multiple transmissions from each CS at each iteration to ensure that the maximum number of completed computations, including those of the straggling servers, can be reported to the aggregating server (AS). We show numerically that the average per-iteration completion time can be reduced significantly by slightly increasing the communication load per server.

Index Terms: Distributed gradient descent, coded computation, coded gradient, polynomial codes, maximum-distance separable codes.
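A toy simulation of the multi-message idea follows: each server reports every completed partial computation as it finishes, so stragglers still contribute, and the aggregating server stops once enough computations are covered. The timing model and task counts are assumptions.

    # Simulate per-iteration completion time when every finished sub-task is
    # reported immediately, rather than one message per server per iteration.
    import heapq
    import random

    def iteration_time(n_servers=4, tasks_per_server=3, tasks_needed=8):
        events = []  # (finish_time, server, task)
        for s in range(n_servers):
            speed = random.uniform(0.5, 2.0)          # straggling via random speed
            t = 0.0
            for task in range(tasks_per_server):
                t += speed * random.expovariate(1.0)  # per-task compute time
                heapq.heappush(events, (t, s, task))
        done = 0
        while events:
            t, s, task = heapq.heappop(events)        # each finish is one message
            done += 1
            if done == tasks_needed:                  # AS can aggregate and stop
                return t

    print(iteration_time())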


Optimized Computation Offloading Performance in Virtual Edge Computing Systems via Deep Reinforcement Learning

Chen, Xianfu, Zhang, Honggang, Wu, Celimuge, Mao, Shiwen, Ji, Yusheng, Bennis, Mehdi

arXiv.org Artificial Intelligence

To improve the quality of the computation experience for mobile devices, mobile-edge computing (MEC) is a promising paradigm that provides computing capabilities in close proximity within a sliced radio access network (RAN) supporting both traditional communication and MEC services. Nevertheless, the design of computation offloading policies for a virtual MEC system remains challenging. Specifically, whether to execute a computation task at the mobile device or to offload it for MEC server execution should adapt to the time-varying network dynamics. In this paper, we consider MEC for a representative mobile user (MU) in an ultra-dense sliced RAN, where multiple base stations (BSs) are available for computation offloading. The problem of finding an optimal computation offloading policy is modelled as a Markov decision process, where our objective is to maximize the long-term utility performance and an offloading decision is made based on the task queue state, the energy queue state, and the channel qualities between the MU and the BSs. To break the curse of high dimensionality in the state space, we first propose a double deep Q-network (DQN) based strategic computation offloading algorithm to learn the optimal policy without a priori knowledge of network dynamics. Then, motivated by the additive structure of the utility function, we combine a Q-function decomposition technique with the double DQN, which leads to a novel learning algorithm for solving the stochastic computation offloading problem. Numerical experiments show that our proposed learning algorithms achieve a significant improvement in computation offloading performance compared with baseline policies.
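To illustrate the decomposition idea, the sketch below learns one Q-head per additive utility term and acts greedily on their sum; the number of terms, shapes, and shared trunk are assumptions.

    # Q-function decomposition: one Q-head per additive reward term (e.g. delay,
    # energy, payment), each trained on its own term; act on the summed Q-values.
    import torch
    import torch.nn as nn

    STATE_DIM, N_ACTIONS, N_TERMS = 10, 5, 3

    class DecomposedQ(nn.Module):
        def __init__(self):
            super().__init__()
            self.trunk = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU())
            self.heads = nn.ModuleList(
                [nn.Linear(64, N_ACTIONS) for _ in range(N_TERMS)])
        def forward(self, s):
            z = self.trunk(s)
            per_term = torch.stack([h(z) for h in self.heads])  # (T, B, A)
            return per_term, per_term.sum(dim=0)

    q = DecomposedQ()
    per_term_q, total_q = q(torch.randn(1, STATE_DIM))
    action = int(total_q.argmax(dim=-1))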